SDA 4.1 Documentation for SDALOG
NAME
sdalog - Generate a report of SDA usage
USAGE
sdalog -g filename [options]
DESCRIPTION
SDALOG reads the SDA logfile and generates a report on SDA usage.
Note that the logfile read by SDALOG is the special file written
by SDA -- not the access log maintained by the web server
software.
The name of the logfile is specified in the SDA Manager as a
Global Specification that applies to a group of SDA datasets.
The SDALOG program is rarely run at the command line.
Instead, it is typically invoked by the SDA Manager when usage
reports are requested. If SDALOG is run at the command line,
wildcards (asterisks) in study name or client address filters can
cause problems because the shell may attempt to expand (glob) the
filter expression. Therefore, when using wildcards in filters,
surround the filter expression with quotes. See the examples at
the bottom of this page.
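The effect of shell expansion can be seen without running SDALOG at
all. The following sketch uses a hypothetical helper, count_args, and
two scratch files (gss2018 and gss2020, names chosen only for
illustration) to show how many arguments a program would receive with
and without quotes around the pattern.

```shell
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() { echo "$#"; }

# Work in a scratch directory containing two files whose names match gss*.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch gss2018 gss2020

# Unquoted: the shell replaces gss* with the matching filenames.
unquoted_count=$(count_args gss*)
# Quoted: the pattern is passed through as one literal argument.
quoted_count=$(count_args "gss*")

echo "unquoted: $unquoted_count argument(s)"
echo "quoted: $quoted_count argument(s)"
```

With the two scratch files present, the unquoted form yields two
arguments and the quoted form yields one, which is why a filter
containing an asterisk should be quoted.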
OUTPUT FORMATS
The default output format reports the following information:
- How many times each dataset was accessed
- How many times each SDA program or procedure was executed
- How many of the reported procedures were run in each month
An optional format (used if a '-c' option is specified) breaks
down usage by client address (rather than by dataset).
If the '-c all' option is used, this style of report will include
the full addresses (the hostname, if available, or else the
numeric IP).
If the '-c 1', '-c 2', or '-c 3' option is used, then only the
final one, two, or three segments of the hostnames (if
available) are displayed. For example, the last segment of the
hostname is the top level domain name -- like 'COM' or 'EDU'. If
the '-c 1' option is specified, a summary of usage by those top
level domains will be generated.
With these options, the numeric IP addresses are not displayed
separately; rather, they are combined and reported as a single
group.
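To make the meaning of "segments" concrete, here is a minimal sketch
of the truncation described above. It is not SDALOG's own code: the
function name summarize_address and the group label numeric-ip are
hypothetical, and the numeric check only recognizes addresses made up
of digits and dots.

```shell
# summarize_address ADDRESS N
# Hypothetical helper: keep the last N dot-separated segments of a
# hostname, or print one shared label for a numeric IP address.
summarize_address() {
    case "$1" in
        *[!0-9.]*) ;;                      # has a non-digit, non-dot character: hostname
        *) echo "numeric-ip"; return 0 ;;  # hypothetical label for the combined group
    esac
    printf '%s\n' "$1" | awk -F. -v n="$2" '{
        start = NF - n + 1
        if (start < 1) start = 1           # fewer segments than requested: keep all
        out = $start
        for (i = start + 1; i <= NF; i++) out = out "." $i
        print out
    }'
}

summarize_address airbears.berkeley.edu 1
summarize_address airbears.berkeley.edu 2
summarize_address 192.168.1.10 2
```

The three calls print edu, berkeley.edu, and numeric-ip respectively,
mirroring the '-c 1' and '-c 2' behavior described above.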
Note that the ability to display all or part of a hostname
assumes that Tomcat has been configured (when SDA was installed)
to resolve IP addresses (usually possible only for a subset of
client addresses). See the SDA Installation Guide
for more information on configuring Tomcat to show
hostnames in the SDA log file.
OPTIONS
The following command-line options are recognized. Some options
affect the logfile -- the pathname of the file, and which records
in the logfile should be included in the report. Other options
affect the output format -- whether to produce the default output
or the optional client address output.
The only required option is the specification of the name of the
logfile.
Log File Options
- -g filename
- The specified filename is the pathname of a logfile maintained by
SDA. (REQUIRED)
- -r range_of_dates_filter
- The report can be limited to a range of dates (or a
single date). A date must be in the format MM/DD/YYYY.
For example: 12/31/2023. The year must be the full four digits.
However, single digit months or days do not need to be filled
with a leading zero. For example: 1/5/2024.
The two dates in a range must be separated by a hyphen.
For example: 6/1/2023-12/31/2023. A date range
specification cannot contain any spaces. This option
cannot be repeated.
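When building a date filter in a script, it can help to check the
format before calling SDALOG. The sketch below is an assumption about
what a well-formed value looks like, based only on the rules stated
above; the function name is_valid_range is hypothetical, and the check
does not verify that the date exists on the calendar (for example, it
accepts 2/31/2024).

```shell
# is_valid_range VALUE
# Hypothetical pre-check: succeeds if VALUE is M/D/YYYY or
# M/D/YYYY-M/D/YYYY with a four-digit year and no spaces.
is_valid_range() {
    date_re='(0?[1-9]|1[0-2])/(0?[1-9]|[12][0-9]|3[01])/[0-9]{4}'
    printf '%s\n' "$1" | grep -Eq "^${date_re}(-${date_re})?\$"
}

for value in "1/5/2024" "6/1/2023-12/31/2023" "1/5/24" "6/1/2023 - 12/31/2023"; do
    if is_valid_range "$value"; then
        echo "ok:  $value"
    else
        echo "bad: $value"
    fi
done
```

The first two values pass; the two-digit year and the range containing
spaces are rejected.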
- -s study_name_filter
- The report can be limited to specified study
name(s). The match is case-insensitive. Multiple
study names can be matched in one specification by using an
asterisk (*) as a wildcard. The asterisk matches any sequence of
characters, including none. Multiple asterisks can also be used
within a study name. For example, a study name specification of
'anes*' will match 'anes', 'anes2020', 'anes-current', etc. A
study name specification of '*nes*' will match 'nes', 'nes2000',
'anes', 'anes2004', etc. If wildcards are not used, then the
specification must match the full name of the dataset in the SDA
log file.
Note that each -s option takes only one study specification, but
the -s option can be repeated.
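The matching rules above (case-insensitive, asterisk as wildcard, full
name required otherwise) can be approximated with the shell's own
pattern matching. This is only an illustrative sketch, not SDALOG's
implementation: the function name matches_filter is hypothetical, and
shell patterns also treat ? and [...] as special, which SDALOG is not
documented to do.

```shell
# matches_filter NAME PATTERN
# Hypothetical helper: succeeds if NAME matches PATTERN, ignoring case,
# with * matching any sequence of characters (including none).
matches_filter() {
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    pattern=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case "$name" in
        $pattern) return 0 ;;   # unquoted so that * acts as a wildcard
        *) return 1 ;;
    esac
}

for name in anes anes2020 ANES-current gss2020; do
    if matches_filter "$name" 'anes*'; then
        echo "$name matches anes*"
    fi
done
```

Here anes, anes2020, and ANES-current match the pattern, while gss2020
does not. Without a wildcard the whole name must match, so a pattern of
'anes' would not match anes2020.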
- -a address_filter
- The report can be limited to one or more client
addresses (hostname or numeric IP). The match is case-insensitive.
An asterisk (*) can be used as a wildcard to match any sequence of
characters, including none. Multiple asterisks can be used
within a specified address. For example, a specification of
'*berkeley.edu' will match 'airbears.berkeley.edu',
'reshall.berkeley.edu', etc. (Note that due to the specificity
of client addresses it is often necessary to use wildcards to get
useful results.)
Note that each -a option takes only one address specification, but
the -a option can be repeated.
Output Options
-o filename
Output from SDALOG will be written to this file.
If this option is not specified, output will be routed
to the user's screen (standard output).
-c all
The report will list the full client addresses
(hostnames, if available, or numeric IP addresses if
not) of the computers used by the SDA users (instead of the
default output format). The number of procedures executed by
each client will also be reported.
This is the only option that lists numeric IP addresses
individually. The other client-based options combine all the
numeric IP addresses into a single group.
-c 3
The last 3 segments of hostnames will be
reported. Example: airbears.berkeley.edu
-c 2
The last 2 segments of hostnames will be
reported. Example: berkeley.edu
-c 1
Only the last segment (top level domain)
of hostnames will be reported. Examples: 'edu' or 'com'
Miscellaneous Options
-x filename
Write lines with badly formed log entries
(if any) into this file. This option is for diagnostic purposes.
-u
Print a list of options (but do not
execute the program).
Deprecated Options
The addition and/or enhancement of various options has made some
older options obsolete. The -f, -F, and -e options have been
deprecated and will be removed entirely from a later version of
SDALOG.
EXAMPLES
- Basic example
- sdalog -g SDAlog -o logreport.txt
- Study filter for a specific dataset (GSS2020)
- sdalog -g SDAlog -s gss2020 -o logreport.txt
- Study filter with a wildcard (asterisk). Note the quotes
around the filter expression to prevent shell expansion.
- sdalog -g SDAlog -s "gss*" -o logreport.txt
- Get the top level domains of user hostnames
- sdalog -g SDAlog -c 1 -o logreport.txt
- Address filter with wildcard (asterisk). Note the quotes
around the filter to prevent shell expansion.
- sdalog -g SDAlog -c 3 -a "*berkeley.edu" -o logreport.txt
CSM, UC Berkeley/ISA
July 12, 2024